Spatial Scan Statistics for Graph Clustering

Authors

  • Bei Wang
  • Jeff M. Phillips
  • Robert Schreiber
  • Dennis M. Wilkinson

Abstract

In this paper, we present a measure associated with detection and inference of statistically anomalous clusters of a graph based on the likelihood test of observed and expected edges in a subgraph. This measure is adapted from spatial scan statistics for point sets and provides quantitative assessment for clusters. We discuss some important properties of this statistic and its relation to modularity and Bregman divergences. We apply a simple clustering algorithm to find clusters with large values of this measure in a variety of real-world data sets, and we illustrate its ability to identify statistically significant clusters of selected granularity.
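
The abstract does not state the formula for the measure. As a rough illustration only, the sketch below scores a candidate vertex set with a Kulldorff-style Poisson log-likelihood ratio, the form commonly used in spatial scan statistics, by comparing the observed number of internal edges with an expected count. The null model for the expected count (squared degree sum divided by 4m, in the style of a configuration model) and the names `poisson_scan_llr` and `subgraph_scan_score` are assumptions made for this example and are not taken from the paper.

```python
import math


def poisson_scan_llr(observed, expected, total):
    """Kulldorff-style Poisson log-likelihood ratio for one candidate cluster.

    observed: count inside the cluster
    expected: expected count inside the cluster under the null model,
              scaled so the expected counts over everything sum to `total`
    total:    total observed count
    Returns 0.0 when the cluster is not denser than expected.
    """
    if expected <= 0 or expected >= total or observed <= expected:
        return 0.0
    inside = observed * math.log(observed / expected)
    outside_obs = total - observed
    outside = 0.0
    if outside_obs > 0:
        outside = outside_obs * math.log(outside_obs / (total - expected))
    return inside + outside


def subgraph_scan_score(edges, cluster):
    """Score a vertex set of an undirected graph given as a list of (u, v) edges.

    Assumption (not from the paper): the expected number of internal edges
    is volume(cluster)^2 / (4 * m), where volume is the degree sum and m is
    the number of edges.
    """
    cluster = set(cluster)
    m = len(edges)
    if m == 0:
        return 0.0
    degree = {}
    internal = 0
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
        if u in cluster and v in cluster:
            internal += 1
    volume = sum(degree.get(v, 0) for v in cluster)
    expected = volume * volume / (4.0 * m)
    return poisson_scan_llr(internal, expected, m)


# Two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
triangle_score = subgraph_scan_score(edges, {0, 1, 2})
sparse_score = subgraph_scan_score(edges, {0, 3})
```

In this toy graph one triangle has more internal edges than the assumed null model predicts and so gets a positive score, while the pair {0, 3} has no internal edge and scores zero. A clustering procedure could then search for vertex sets that maximize such a score.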

Similar resources

Power evaluation of disease clustering tests

BACKGROUND: Many different test statistics have been proposed to test for spatial clustering. Some of these statistics have been widely used in various applications. In this paper, we use an existing collection of 1,220,000 simulated benchmark data, generated under 51 different clustering models, to compare the statistical power of several disease clustering tests. These tests are Besag-Newell'...

Large-scale disease clustering and its application in epidemiological and health studies

Spatial autocorrelation statistics provide summary information about the spatial arrangement of data in a map. In fact, these statistics compare neighboring area values in order to assess the level of large scale clustering. Whenever a large number of neighboring areas have either relatively large or relatively small values, large scale clustering may be detected. Detecting such clustering is a...

A weighted average likelihood ratio test for spatial clustering of disease.

We consider methods proposed for detecting localized spatial clustering. We propose a new test statistic, the weighted average likelihood ratio test, as an alternative to the spatial scan (maximum likelihood ratio) test statistic. Two different types of weights are considered. We propose an unbiased cluster selection criterion and evaluate the bias of the tests through simulation. We also exami...

Performance of cancer cluster Q-statistics for case-control residential histories.

Few investigations of health event clustering have evaluated residential mobility, though causative exposures for chronic diseases such as cancer often occur long before diagnosis. Recently developed Q-statistics incorporate human mobility into disease cluster investigations by quantifying space- and time-dependent nearest neighbor relationships. Using residential histories from two cancer case...

Selection of the Maximum Spatial Cluster Size of the Spatial Scan Statistic by Using the Maximum Clustering Set-Proportion Statistic

Spatial scan statistics are widely used in various fields. The performance of these statistics is influenced by parameters, such as maximum spatial cluster size, and can be improved by parameter selection using performance measures. Current performance measures are based on the presence of clusters and are thus inapplicable to data sets without known clusters. In this work, we propose a novel o...

Publication date: 2008